Utilising policy types for effective ad hoc coordination in multiagent systems
Abstract
This thesis is concerned with the ad hoc coordination problem. Therein, the goal is to design an autonomous agent which can achieve high flexibility and efficiency in a multiagent system that admits no prior coordination between the designed agent and the other agents. Flexibility describes the agent’s ability to solve its task with a variety of other agents in the system; efficiency is the relation between the agent’s payoffs and the time needed to solve the task; and no prior coordination means that the agent does not know a priori how the other agents behave. This problem is relevant for a number of practical applications, including human-machine interaction tasks, such as adaptive user interfaces, robotic elderly care, and automated trading agents.

Motivated by this problem, the central idea studied in this thesis is to utilise a set of policies, or types, to characterise the behaviour of other agents. Specifically, the idea is to reduce the complexity of the interaction problem by assuming that the other agents draw their latent type from some known or hypothesised space of types, and that the assignment of types is governed by an unknown distribution. Based on the current interaction history, we can form posterior beliefs about the relative likelihood of types. These beliefs, combined with the future predictions of the types, can then be used in a planning procedure to compute optimal responses. The aim of this thesis is to study the potential and limitations of this idea in the context of ad hoc coordination.

We formulate the ad hoc coordination problem using a game-theoretic model called the stochastic Bayesian game. Based on this model, we derive a canonical algorithmic description of the idea outlined above, called Harsanyi-Bellman Ad Hoc Coordination (HBA). The practical potential of HBA is demonstrated in two case studies, including a human-machine experiment and a simulated logistics domain. We formulate basic ways to incorporate evidence (i.e. observed actions) into posterior beliefs and analyse the conditions under which the posterior beliefs converge to the true distribution of types. Furthermore, we study the impact of prior beliefs over types (that is, before any actions are observed) on the long-term performance of HBA, and show empirically that automatic methods can compute prior beliefs with consistent performance effects. For hypothesised (i.e. “guessed”) type spaces, we analyse the relations between hypothesised and true type spaces under which HBA is still guaranteed to solve its task, despite inaccuracies in hypothesised types. Finally, we show how HBA can perform an automatic statistical analysis to decide whether to reject its behavioural hypothesis, i.e. the combination of posterior beliefs and types.
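The type-based idea in the abstract can be illustrated with a small sketch. The code below is not the thesis's HBA algorithm; it is a simplified illustration under stated assumptions: a single other agent, a product-form posterior (prior times the likelihood of each observed action), and a myopic one-step best response in place of the planning procedure. All names (`posterior_over_types`, `best_response`, the two example types, and the coordination payoff) are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# Assumption: a "type" is a policy mapping the history of the other
# agent's past actions to a probability distribution over its next action.
Policy = Callable[[Sequence[str]], Dict[str, float]]


def posterior_over_types(
    prior: Dict[str, float],
    types: Dict[str, Policy],
    observed_actions: Sequence[str],
) -> Dict[str, float]:
    """Product posterior: prior weight times likelihood of each observed action."""
    weights = dict(prior)
    history: List[str] = []
    for action in observed_actions:
        for name, policy in types.items():
            weights[name] *= policy(history).get(action, 0.0)
        history.append(action)
    total = sum(weights.values())
    if total == 0.0:
        # No hypothesised type explains the history; fall back to the prior.
        return dict(prior)
    return {name: w / total for name, w in weights.items()}


def best_response(
    beliefs: Dict[str, float],
    types: Dict[str, Policy],
    history: Sequence[str],
    payoff: Callable[[str, str], float],
    my_actions: Sequence[str],
) -> str:
    """Myopic stand-in for planning: maximise expected one-step payoff."""
    def expected(my_action: str) -> float:
        return sum(
            beliefs[name] * prob * payoff(my_action, other_action)
            for name, policy in types.items()
            for other_action, prob in policy(history).items()
        )
    return max(my_actions, key=expected)


# Hypothetical type space with two simple, history-independent policies.
types: Dict[str, Policy] = {
    "always_left": lambda history: {"L": 1.0},
    "mostly_right": lambda history: {"L": 0.2, "R": 0.8},
}
prior = {"always_left": 0.5, "mostly_right": 0.5}


def coordination_payoff(mine: str, other: str) -> float:
    """Assumed payoff: 1 if both agents pick the same action, else 0."""
    return 1.0 if mine == other else 0.0


observed = ["R", "R"]
beliefs = posterior_over_types(prior, types, observed)
action = best_response(beliefs, types, observed, coordination_payoff, ["L", "R"])
```

In this toy setting, two observed `R` actions rule out the `always_left` type, so the beliefs concentrate on `mostly_right` and the myopic response is to play `R`. A faithful implementation would replace the one-step maximisation with planning over future states of the stochastic Bayesian game.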
Similar resources
Ad hoc coordination in multiagent systems with applications to human-machine interaction
This thesis is concerned with the ad hoc coordination problem, in which the goal is to design an autonomous agent which is able to achieve optimal flexibility and efficiency in a multiagent system with no mechanisms for prior behavioural coordination. The thesis is primarily motivated by human-machine interaction problems, which can often be formulated in this setting. This paper gives a brief a...
Ad Hoc Coordination in Multiagent Systems with Applications to Human-Machine Interaction (Doctoral Consortium)
This thesis is concerned with the ad hoc coordination problem, in which the goal is to design an autonomous agent which is able to achieve optimal flexibility and efficiency in a multiagent system with no mechanisms for prior behavioural coordination. The thesis is primarily motivated by human-machine interaction problems, which can often be formulated in this setting. This paper gives a brief a...
Communicating Intentions for Coordination with Unknown
Ad hoc multiagent teamwork introduces the challenge of coordinating with a variety of potential teammates, including teammates with unknown behavior. We examine the communication of policy information for enhanced coordination between such agents. The proposed decision-theoretic approach examines the uncertainty within a model of an unfamiliar teammate, identifying and acquiring policy informat...
Policy Communication for Coordination with Unknown Teammates
Within multiagent teams research, existing approaches commonly assume agents have perfect knowledge regarding the decision process guiding their teammates’ actions. More recently, ad hoc teamwork was introduced to address situations where an agent must coordinate with a variety of potential teammates, including teammates with unknown behavior. This paper examines the communication of intentions...
Models and Algorithms for Ad Hoc Coordination in Multiagent Systems
The goal of this thesis is to develop new models and algorithms for the ad hoc coordination problem. Therein, the problem is to design an autonomous agent which is able to achieve optimal flexibility and efficiency in a multiagent system in which the behaviour of the other agents is not a priori known. This problem is motivated by applications such as human-machine interaction, robot search and re...